Search Results: "Mattia Rizzolo"

5 September 2021

Reproducible Builds: Reproducible Builds in August 2021

Welcome to the latest report from the Reproducible Builds project. In this post, we round up the important things that happened in the world of reproducible builds in August 2021. As always, if you are interested in contributing to the project, please visit the Contribute page on our website.
There were a large number of talks related to reproducible builds at DebConf21 this year, the 21st annual conference of the Debian Linux distribution (full schedule):
PackagingCon (@PackagingCon) is new conference for developers of package management software as well as their related communities and stakeholders. The virtual event, which is scheduled to take place on the 9th and 10th November 2021, has a mission is to bring different ecosystems together: from Python s pip to Rust s cargo to Julia s Pkg, from Debian apt over Nix to conda and mamba, and from vcpkg to Spack we hope to have many different approaches to package management at the conference . A number of people from reproducible builds community are planning on attending this new conference, and some may even present. Tickets start at $20 USD.
As reported in our May report, the president of the United States signed an executive order outlining policies aimed to improve the cybersecurity in the US. The executive order comes after a number of highly-publicised security problems such as a ransomware attack that affected an oil pipeline between Texas and New York and the SolarWinds hack that affected a large number of US federal agencies. As a followup this month, however, a detailed fact sheet was released announcing a number large-scale initiatives and that will undoubtedly be related to software supply chain security and, as a result, reproducible builds.
Lastly, We ran another productive meeting on IRC in August (original announcement) which ran for just short of two hours. A full set of notes from the meeting is available.

Software development kpcyrd announced an interesting new project this month called I probably didn t backdoor this which is an attempt to be:
a practical attempt at shipping a program and having reasonably solid evidence there s probably no backdoor. All source code is annotated and there are instructions explaining how to use reproducible builds to rebuild the artifacts distributed in this repository from source. The idea is shifting the burden of proof from you need to prove there s a backdoor to we need to prove there s probably no backdoor . This repository is less about code (we re going to try to keep code at a minimum actually) and instead contains technical writing that explains why these controls are effective and how to verify them. You are very welcome to adopt the techniques used here in your projects. ( )
As the project s README goes on the mention: the techniques used to rebuild the binary artifacts are only possible because the builds for this project are reproducible . This was also announced on our mailing list this month in a thread titled i-probably-didnt-backdoor-this: Reproducible Builds for upstreams. kpcyrd also wrote a detailed blog post about the problems surrounding Linux distributions (such as Alpine and Arch Linux) that distribute compiled Python bytecode in the form of .pyc files generated during the build process.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made a number of changes, including releasing version 180), version 181) and version 182) as well as the following changes:
  • New features:
    • Add support for extracting the signing block from Android APKs. [ ]
    • If we specify a suffix for a temporary file or directory within the code, ensure it starts with an underscore (ie. _ ) to make the generated filenames more human-readable. [ ]
    • Don t include short GCC lines that differ on a single prefix byte either. These are distracting, not very useful and are simply the strings(1) command s idea of the build ID, which is displayed elsewhere in the diff. [ ][ ]
    • Don t include specific .debug-like lines in the ELF-related output, as it is invariably a duplicate of the debug ID that exists better in the readelf(1) differences for this file. [ ]
  • Bug fixes:
    • Add a special case to SquashFS image extraction to not fail if we aren t the superuser. [ ]
    • Only use java -jar /path/to/apksigner.jar if we have an apksigner.jar as newer versions of apksigner in Debian use a shell wrapper script which will be rejected if passed directly to the JVM. [ ]
    • Reduce the maximum line length for calculating Wagner-Fischer, improving the speed of output generation a lot. [ ]
    • Don t require apksigner in order to compare .apk files using apktool. [ ]
    • Update calls (and tests) for the new version of odt2txt. [ ]
  • Output improvements:
    • Mention in the output if the apksigner tool is missing. [ ]
    • Profile diffoscope.diff.linediff and specialize. [ ][ ]
  • Logging improvements:
    • Format debug-level messages related to ELF sections using the diffoscope.utils.format_class. [ ]
    • Print the size of generated reports in the logs (if possible). [ ]
    • Include profiling information in --debug output if --profile is not set. [ ]
  • Codebase improvements:
    • Clarify a comment about the HUGE_TOOLS Python dictionary. [ ]
    • We can pass -f to apktool to avoid creating a strangely-named subdirectory. [ ]
    • Drop an unused File import. [ ]
    • Update the supported & minimum version of Black. [ ]
    • We don t use the logging variable in a specific place, so alias it to an underscore (ie. _ ) instead. [ ]
    • Update some various copyright years. [ ]
    • Clarify a comment. [ ]
  • Test improvements:
    • Update a test to check specific contents of SquashFS listings, otherwise it fails depending on the test systems user ID to username passwd(5) mapping. [ ]
    • Assign seen and expected values to local variables to improve contextual information in failed tests. [ ]
    • Don t print an orphan newline when the source code formatting test passes. [ ]

In addition Santiago Torres Arias added support for Squashfs version 4.5 [ ] and Felix C. Stegerman suggested a number of small improvements to the output of the new APK signing block [ ]. Lastly, Chris Lamb uploaded python-libarchive-c version 3.1-1 to Debian experimental for the new 3.x branch python-libarchive-c is used by diffoscope.

Distribution work In Debian, 68 reviews of packages were added, 33 were updated and 10 were removed this month, adding to our knowledge about identified issues. Two new issue types have been identified too: nondeterministic_ordering_in_todo_items_collected_by_doxygen and kodi_package_captures_build_path_in_source_filename_hash. kpcyrd published another monthly report on their work on reproducible builds within the Alpine and Arch Linux distributions, specifically mentioning rebuilderd, one of the components powering reproducible.archlinux.org. The report also touches on binary transparency, an important component for supply chain security. The @GuixHPC account on Twitter posted an infographic on what fraction of GNU Guix packages are bit-for-bit reproducible: Finally, Bernhard M. Wiedemann posted his monthly reproducible builds status report for openSUSE.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Elsewhere, it was discovered that when supporting various new language features and APIs for Android apps, the resulting APK files that are generated now vary wildly from build to build (example diffoscope output). Happily, it appears that a patch has been committed to the relevant source tree. This was also discussed on our mailing list this month in a thread titled Android desugaring and reproducible builds started by Marcus Hoffmann.

Website and documentation There were quite a few changes to the Reproducible Builds website and documentation this month, including:
  • Felix C. Stegerman:
    • Update the website self-build process to not use the buster-backports suite now that Debian Bullseye is the stable release. [ ]
  • Holger Levsen:
    • Add a new page documenting various package rebuilder solutions. [ ]
    • Add some historical talks and slides from DebConf20. [ ][ ]
    • Various improvements to the history page. [ ][ ][ ]
    • Rename the Comparison protocol documentation category to Verification . [ ]
    • Update links to F-Droid documentation. [ ]
  • Ian Muchina:
    • Increase the font size of titles and de-emphasize event details on the talk page. [ ]
    • Rename the README file to README.md to improve the user experience when browsing the Git repository in a web browser. [ ]
  • Mattia Rizzolo:
    • Drop a position:fixed CSS statement that is negatively affecting with some width settings. [ ]
    • Fix the sizing of the elements inside the side navigation bar. [ ]
    • Show gold level sponsors and above in the sidebar. [ ]
    • Updated the documentation within reprotest to mention how ldconfig conflicts with the kernel variation. [ ]
  • Roland Clobus:
    • Added a ticket number for the issue with the live Cinnamon image and diffoscope. [ ]

Testing framework The Reproducible Builds project runs a testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Debian-related changes:
      • Make a large number of changes to support the new Debian bookworm release, including adding it to the dashboard [ ], start scheduling tests [ ], adding suitable Apache redirects [ ] etc. [ ][ ][ ][ ][ ]
      • Make the first build use LANG=C.UTF-8 to match the official Debian build servers. [ ]
      • Only test Debian Live images once a week. [ ]
      • Upgrade all nodes to use Debian Bullseye [ ] [ ]
      • Update README documentation for the Debian Bullseye release. [ ]
    • Other changes:
      • Only include rsync output if the $DEBUG variable is enabled. [ ]
      • Don t try to install mock, a tool used to build Fedora packages some time ago. [ ]
      • Drop an unused function. [ ]
      • Various documentation improvements. [ ][ ]
      • Improve the node health check to detect zombie jobs. [ ]
  • Jessica Clarke (FreeBSD-related changes):
    • Update the location and branch name for the main FreeBSD Git repository. [ ]
    • Correctly ignore the source tarball when comparing build results. [ ]
    • Drop an outdated version number from the documentation. [ ]
  • Mattia Rizzolo:
    • Block F-Droid jobs from running whilst the setup is running. [ ]
    • Enable debugging for the rsync job related to Debian Live images. [ ]
    • Pass BUILD_TAG and BUILD_URL environment for the Debian Live jobs. [ ]
    • Refactor the master_wrapper script to use a Bash array for the parameters. [ ]
    • Prefer YAML s safe_load() function over the unsafe variant. [ ]
    • Use the correct variable in the Apache config to match possible existing files on disk. [ ]
    • Stop issuing HTTP 301 redirects for things that not actually permanent. [ ]
  • Roland Clobus (Debian live image generation):
    • Increase the diffoscope timeout from 120 to 240 minutes; the Cinnamon image should now be able to finish. [ ]
    • Use the new snapshot service. [ ]
    • Make a number of improvements to artifact handling, such as moving the artifacts to the Jenkins host [ ] and correctly cleaning them up at the right time. [ ][ ][ ]
    • Where possible, link to the Jenkins build URL that created the artifacts. [ ][ ]
    • Only allow only one job to run at the same time. [ ]
  • Vagrant Cascadian:
    • Temporarily disable armhf nodes for DebConf21. [ ][ ]

Lastly, if you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can get in touch with us via:

16 July 2021

Reproducible Builds (diffoscope): diffoscope 178 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 178. This version includes the following changes:
[ Chris Lamb ]
* Don't traceback on an broken symlink in a directory.
  (Closes: reproducible-builds/diffoscope#269)
* Rewrite the calculation of a file's "fuzzy hash" to make the control
  flow cleaner.
[ Balint Reczey ]
* Support .deb package members compressed with the Zstandard algorithm.
  (LP: #1923845)
[ Jean-Romain Garnier ]
* Overhaul the Mach-O executable file comparator.
* Implement tests for the Mach-O comparator.
* Switch to new argument format for the LLVM compiler.
* Fix test_libmix_differences in testsuite for the ELF format.
* Improve macOS compatibility for the Mach-O comparator.
* Add llvm-readobj and llvm-objdump to the internal EXTERNAL_TOOLS data
  structure.
[ Mattia Rizzolo ]
* Invoke gzip(1) with the short option variants to support Busybox's gzip.
  (Closes: reproducible-builds/diffoscope#268)
You find out more by visiting the project homepage.

7 May 2021

Reproducible Builds (diffoscope): diffoscope 174 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 174. This version includes the following changes:
[ Chris Lamb ]
* Check that we are parsing an actual Debian .buildinfo file, not just
  a file with that extension.
  (Closes: #987994, reproducible-builds/diffoscope#254)
* Support signed .buildinfo files again -- file(1) reports them as
  "PGP signed message".
[ Mattia Rizzolo ]
* Make the testsuite pass with file(1) version 5.40.
* Embed some short test fixtures in the test code itself.
* Fix recognition of compressed .xz files with file(1) 5.40.
You find out more by visiting the project homepage.

26 March 2021

Reproducible Builds (diffoscope): diffoscope 171 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 171. This version includes the following changes:
[ Mattia Rizzolo ]
* Do not list as a "skipping reason" tools that do exist.
* Drop the "compose" tool from the list of required tools for these tests,
  since it doesn't seem to be required.
You find out more by visiting the project homepage.

23 January 2021

Reproducible Builds (diffoscope): diffoscope 165 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 165. This version includes the following changes:
[ Dimitrios Apostolou ]
* Introduce the --no-acl and --no-xattr arguments [later collapsed to
  --extended-filesystem-attributes] to improve performance.
* Avoid calling the external stat command.
[ Chris Lamb ]
* Collapse --acl and --xattr into --extended-filesystem-attributes to cover
  all of these extended attributes, defaulting the new option to false (ie.
  to not check these very expensive external calls).
[ Mattia Rizzolo ]
* Override several lintian warnings regarding prebuilt binaries in the
* source.
* Add a pytest.ini file to explicitly use Junit's xunit2 format.
* Ignore the Python DeprecationWarning message regarding the  imp  module
  deprecation as it comes from a third-party library.
* debian/rules: filter the content of the d/*.substvars files
You find out more by visiting the project homepage.

8 January 2021

Reproducible Builds (diffoscope): diffoscope 164 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 164. This version includes the following changes:
[ Chris Lamb ]
* Truncate jsondiff differences at 512 bytes lest they consume the entire page.
* Wrap our external call to cmp(1) with a profile (to match the internal
  profiling).
* Add a note regarding the specific ordering of the new
  all_tools_are_listed test.
[ Dimitrios Apostolou ]
* Performance improvements:
  - Improve speed of has_same_content by spawning cmp(1) less frequently.
  - Log whenever the external cmp(1) command is spawn.ed
  - Avoid invoking external diff for identical, short outputs.
* Rework handling of temporary files:
  - Clean up temporary directories as we go along, instead of at the end.
  - Delete FIFO files when the FIFO feeder's context manager exits.
[ Mattia Rizzolo ]
* Fix a number of potential crashes in --list-debian-substvars, including
  explicitly listing lipo and otool as external tools.
 - Remove redundant code and let object destructors clean up after themselves.
[ Conrad Ratschan ]
* Add a comparator for Flattened Image Trees (FIT) files, a boot image format
  used by U-Boot.
You find out more by visiting the project homepage.

30 November 2020

Chris Lamb: Free software activities in November 2020

Here is my monthly update covering what I have been doing in the free software world during November 2020 (previous month):

Reproducible Builds One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. The project is proud to be a member project of the Software Freedom Conservancy. Conservancy acts as a corporate umbrella allowing projects to operate as non-profit initiatives without managing their own corporate structure. If you like the work of the Conservancy or the Reproducible Builds project, please consider becoming an official supporter.
This month, I:
I also made the following changes to diffoscope:

Debian I performed the following uploads to the Debian Linux distribution this month: I also filed a release-critical bug against the minidlna package which could not be successfully purged from the system without reporting a cannot remove '/var/log/minidlna' error. (#975372)

Debian LTS This month I have worked 18 hours on Debian Long Term Support (LTS) and 12 hours on its sister Extended LTS project, including: You can find out more about the Debian LTS project via the following video:

27 November 2020

Reproducible Builds (diffoscope): diffoscope 162 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 162. This version includes the following changes:
[ Chris Lamb ]
* Don't depends on radare2 in the Debian autopkgtests as it will not be in
  bullseye due to security considerations (#950372). (Closes: #975313)
* Avoid "Command  s p a c e d o u t  failed" messages when creating an
  artificial CalledProcessError instance in our generic from_operation
  feeder creator.
* Overhaul long and short descriptions.
* Use the operation's full name so that "command failed" messages include
  its arguments.
* Add a missing comma in a comment.
[ Jelmer Vernoo  ]
* Add missing space to the error message when only one argument is passed to
  diffoscope.
[ Holger Levsen ]
* Update Standards-Version to 4.5.1.
[ Mattia Rizzolo ]
* Split the diffoscope package into a diffoscope-minimal package that
  excludes the larger packages from Recommends. (Closes: #975261)
* Drop support for Python 3.6.
You find out more by visiting the project homepage.

11 November 2020

Reproducible Builds: Reproducible Builds in October 2020

Welcome to the October 2020 report from the Reproducible Builds project. In our monthly reports, we outline the major things that we have been up to over the past month. As a brief reminder, the motivation behind the Reproducible Builds effort is to ensure flaws have not been introduced in the binaries we install on our systems. If you are interested in contributing to the project, please visit our main website.

General On Saturday 10th October, Morten Linderud gave a talk at Arch Conf Online 2020 on The State of Reproducible Builds in Arch. The video should be available later this month, but as a teaser:
The previous year has seen great progress in Arch Linux to get reproducible builds in the hands of the users and developers. In this talk we will explore the current tooling that allows users to reproduce packages, the rebuilder software that has been written to check packages and the current issues in this space.
During the Reproducible Builds summit in Marrakesh in 2019, developers from the GNU Guix, NixOS and Debian distributions were able to produce a bit-for-bit identical GNU Mes binary despite using three different versions of GCC. Since this summit, additional work resulted in a bit-for-bit identical Mes binary using tcc, and last month a fuller update was posted to this effect by the individuals involved. This month, however, David Wheeler updated his extensive page on Fully Countering Trusting Trust through Diverse Double-Compiling, remarking that:
GNU Mes rebuild is definitely an application of [Diverse Double-Compiling]. [..] This is an awesome application of DDC, and I believe it s the first publicly acknowledged use of DDC on a binary
There was a small, followup discussion on our mailing list. In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update. This month, the Reproducible Builds project restarted our IRC meetings, managing to convene twice: the first time on October 12th (summary & logs), and later on the 26th (logs). As mentioned in previous reports, due to the unprecedented events throughout 2020, there will be no in-person summit event this year. On our mailing list this month El as Alejandro posted a request for help with a local configuration

Software development This month, we tried to fix a large number of currently-unreproducible packages, including: Bernhard M. Wiedemann also reported three issues against bison, ibus and postgresql12.

Tools diffoscope is our in-depth and content-aware diff utility. Not only could you locate and diagnose reproducibility issues, it provides human-readable diffs of all kinds too. This month, Chris Lamb uploaded version 161 to Debian (later backported by Mattia Rizzolo), as well as made the following changes:
  • Move test_ocaml to the assert_diff helper. [ ]
  • Update tests to support OCaml version 4.11.1. Thanks to Sebastian Ramacher for the report. (#972518)
  • Bump minimum version of the Black source code formatter to 20.8b1. (#972518)
In addition, Jean-Romain Garnier temporarily updated the dependency on radare2 to ensure our test pipelines continue to work [ ], and for the GNU Guix distribution Vagrant Cascadian diffoscope to version 161 [ ]. In related development, trydiffoscope is the web-based version of diffoscope. This month, Chris Lamb made the following changes:
  • Mark a --help-only test as being a superficial test. (#971506)
  • Add a real, albeit flaky, test that interacts with the try.diffoscope.org service. [ ]
  • Bump debhelper compatibility level to 13 [ ] and bump Standards-Version to 4.5.0 [ ].
Lastly, disorderfs version 0.5.10-2 was uploaded to Debian unstable by Holger Levsen, which enabled security hardening via DEB_BUILD_MAINT_OPTIONS [ ] and dropped debian/disorderfs.lintian-overrides [ ].

Website and documentation This month, a number of updates to the main Reproducible Builds website and related documentation were made by Chris Lamb:
  • Add a citation link to the academic article regarding dettrace [ ], and added yet another supply-chain security attack publication [ ].
  • Reformatted the Jekyll s Liquid templating language and CSS formatting to be consistent [ ] as well as expand a number of tab characters [ ].
  • Used relative_url to fix missing translation icon on various pages. [ ]
  • Published two announcement blog posts regarding the restarting of our IRC meetings. [ ][ ]
  • Added an explicit note regarding the lack of an in-person summit in 2020 to our events page. [ ]

Testing framework The Reproducible Builds project operates a Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, Holger Levsen made the following changes:
  • Debian-related changes:
    • Refactor and improve the Debian dashboard. [ ][ ][ ]
    • Track bugs which are usertagged as filesystem , fixfilepath , etc.. [ ][ ][ ]
    • Make a number of changes to package index pages. [ ][ ][ ]
  • System health checks:
    • Relax disk space warning levels. [ ]
    • Specifically detect build failures reported by dpkg-buildpackage. [ ]
    • Fix a regular expression to detect outdated package sets. [ ]
    • Detect Lintian issues in diffoscope. [ ]
  • Misc:
    • Make a number of updates to reflect that our sponsor Profitbricks has renamed itself to IONOS. [ ][ ][ ][ ]
    • Run a F-Droid maintenance routine twice a month to utilise its cleanup features. [ ]
    • Fix the target name in OpenWrt builds to ath79 from ath97. [ ]
    • Add a missing Postfix configuration for a node. [ ]
    • Temporarily disable Arch Linux builds until a core node is back. [ ]
    • Make a number of changes to our thanks page. [ ][ ][ ]
Build node maintenance was performed by both Holger Levsen [ ][ ] and Vagrant Cascadian [ ][ ][ ], Vagrant Cascadian also updated the page listing the variations made when testing to reflect changes for in build paths [ ] and Hans-Christoph Steiner made a number of changes for F-Droid, the free software app repository for Android devices, including:
  • Do not fail reproducibility jobs when their cleanup tasks fail. [ ]
  • Skip libvirt-related sudo command if we are not actually running libvirt. [ ]
  • Use direct URLs in order to eliminate a useless HTTP redirect. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. However, you can also get in touch with us via:

5 October 2020

Reproducible Builds: Reproducible Builds in September 2020

Welcome to the September 2020 report from the Reproducible Builds project. In our monthly reports, we attempt to summarise the things that we have been up to over the past month, but if you are interested in contributing to the project, please visit our main website. This month, the Reproducible Builds project was pleased to announce a donation from Amateur Radio Digital Communications (ARDC) in support of its goals. ARDC s contribution will propel the Reproducible Builds project s efforts in ensuring the future health, security and sustainability of our increasingly digital society. Amateur Radio Digital Communications (ARDC) is a non-profit which was formed to further research and experimentation with digital communications using radio, with a goal of advancing the state of the art of amateur radio and to educate radio operators in these techniques. You can view the full announcement as well as more information about ARDC on their website.
In August s report, we announced that Jennifer Helsby (redshiftzero) launched a new reproduciblewheels.com website to address the lack of reproducibility of Python wheels . This month, Kushal Das posted a brief follow-up to provide an update on reproducible sources as well. The Threema privacy and security-oriented messaging application announced that within the next months , their apps will become fully open source, supporting reproducible builds :
This is to say that anyone will be able to independently review Threema s security and verify that the published source code corresponds to the downloaded app.
You can view the full announcement on Threema s website.

Events Sadly, due to the unprecedented events in 2020, there will be no in-person Reproducible Builds event this year. However, the Reproducible Builds project intends to resume meeting regularly on IRC, starting on Monday, October 12th at 18:00 UTC (full announcement). The cadence of these meetings will probably be every two weeks, although this will be discussed and decided on at the first meeting. (An editable agenda is available.) On 18th September, Bernhard M. Wiedemann gave a presentation in German titled Wie reproducible builds Software sicherer machen ( How reproducible builds make software more secure ) at the Internet Security Digital Days 2020 conference. (View video.) On Saturday 10th October, Morten Linderud will give a talk at Arch Conf Online 2020 on The State of Reproducible Builds in the Arch Linux distribution:
The previous year has seen great progress in Arch Linux to get reproducible builds in the hands of the users and developers. In this talk we will explore the current tooling that allows users to reproduce packages, the rebuilder software that has been written to check packages and the current issues in this space.
During the Reproducible Builds summit in Marrakesh, GNU Guix, NixOS and Debian were able to produce a bit-for-bit identical binary when building GNU Mes, despite using three different major versions of GCC. Since the summit, additional work resulted in a bit-for-bit identical Mes binary using tcc and this month, a fuller update was posted by the individuals involved.

Development work In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

Debian Chris Lamb uploaded a number of Debian packages to address reproducibility issues that he had previously provided patches for, including cfingerd (#831021), grap (#870573), splint (#924003) & schroot (#902804) Last month, an issue was identified where a large number of Debian .buildinfo build certificates had been tainted on the official Debian build servers, as these environments had files underneath the /usr/local/sbin directory to prevent the execution of system services during package builds. However, this month, Aurelien Jarno and Wouter Verhelst fixed this issue in varying ways, resulting in a special policy-rcd-declarative-deny-all package. Building on Chris Lamb s previous work on reproducible builds for Debian .ISO images, Roland Clobus announced his work in progress on making the Debian Live images reproducible. [ ] Lucas Nussbaum performed an archive-wide rebuild of packages to test enabling the reproducible=+fixfilepath Debian build flag by default. Enabling the fixfilepath feature will likely fix reproducibility issues in an estimated 500-700 packages. The test revealed only 33 packages (out of 30,000 in the archive) that fail to build with fixfilepath. Many of those will be fixed when the default LLVM/Clang version is upgraded. 79 reviews of Debian packages were added, 23 were updated and 17 were removed this month adding to our knowledge about identified issues. Chris Lamb added and categorised a number of new issue types, including packages that captures their build path via quicktest.h and absolute build directories in documentation generated by Doxygen , etc. Lastly, Lukas Puehringer s uploaded a new version of the in-toto to Debian which was sponsored by Holger Levsen. [ ]

diffoscope diffoscope is our in-depth and content-aware diff utility that can not only locate and diagnose reproducibility issues, it provides human-readable diffs of all kinds too. In September, Chris Lamb made the following changes to diffoscope, including preparing and uploading versions 159 and 160 to Debian:
  • New features:
    • Show ordering differences only in strings(1) output by applying the ordering check to all differences across the codebase. [ ]
  • Bug fixes:
    • Mark some PGP tests that they require pgpdump, and check that the associated binary is actually installed before attempting to run it. (#969753)
    • Don t raise exceptions when cleaning up after guestfs cleanup failure. [ ]
    • Ensure we check FALLBACK_FILE_EXTENSION_SUFFIX, otherwise we run pgpdump against all files that are recognised by file(1) as data. [ ]
  • Codebase improvements:
    • Add some documentation for the EXTERNAL_TOOLS dictionary. [ ]
    • Abstract out a variable we use a couple of times. [ ]
  • diffoscope.org website improvements:
    • Make the (long) demonstration GIF less prominent on the page. [ ]
In addition, Paul Spooren added support for automatically deploying Docker images. [ ]

Website and documentation This month, a number of updates to the main Reproducible Builds website and related documentation. Chris Lamb made the following changes: In addition, Holger Levsen re-added the documentation link to the top-level navigation [ ] and documented that the jekyll-polyglot package is required [ ]. Lastly, diffoscope.org and reproducible-builds.org were transferred to Software Freedom Conservancy. Many thanks to Brett Smith from Conservancy, J r my Bobbio (lunar) and Holger Levsen for their help with transferring and to Mattia Rizzolo for initiating this.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of these patches, including: Bernhard M. Wiedemann also reported issues in git2-rs, pyftpdlib, python-nbclient, python-pyzmq & python-sidpy.

Testing framework The Reproducible Builds project operates a Jenkins-based testing framework to power tests.reproducible-builds.org. This month, Holger Levsen made the following changes:
  • Debian:
    • Shorten the subject of nodes have gone offline notification emails. [ ]
    • Also track bugs that have been usertagged with usrmerge. [ ]
    • Drop abort-related codepaths as that functionality has been removed from Jenkins. [ ]
    • Update the frequency we update base images and status pages. [ ][ ][ ][ ]
  • Status summary view page:
    • Add support for monitoring systemctl status [ ] and the number of diffoscope processes [ ].
    • Show the total number of nodes [ ] and colourise critical disk space situations [ ].
    • Improve the visuals with respect to vertical space. [ ][ ]
  • Debian rebuilder prototype:
    • Resume building random packages again [ ] and update the frequency that packages are rebuilt. [ ][ ]
    • Use --no-respect-build-path parameter until sbuild 0.81 is available. [ ]
    • Treat the inability to locate some packages as a debrebuild problem, and not as a issue with the rebuilder itself. [ ]
  • Arch Linux:
    • Update various components to be compatible with Arch Linux s move to the xz compression format. [ ][ ][ ]
    • Allow scheduling of old packages to catch up on the backlog. [ ][ ][ ]
    • Improve formatting on the summary page. [ ][ ]
    • Update HTML pages once every hour, not every 30 minutes. [ ]
    • Use the Ubuntu (!) GPG keyserver to validate packages. [ ]
  • System health checks:
    • Highlight important bad conditions in colour. [ ][ ]
    • Add support for detecting more problems, including Jenkins shutdown issues [ ], failure to upgrade Arch Linux packages [ ], kernels with wrong permissions [ ], etc.
  • Misc:
    • Delete old schroot sessions after 2 days, not 3. [ ]
    • Use sudo to cleanup diffoscope schroot sessions. [ ]
In addition, stefan0xC fixed a query for unknown results in the handling of Arch Linux packages [ ] and Mattia Rizzolo updated the template that notifies maintainers by email of their newly-unreproducible packages to ensure that it did not get caught in junk/spam folders [ ]. Finally, build node maintenance was performed by Holger Levsen [ ][ ][ ][ ], Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

9 September 2020

Reproducible Builds: Reproducible Builds in August 2020

Welcome to the August 2020 report from the Reproducible Builds project. In our monthly reports, we summarise the things that we have been up to over the past month. The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced from the original free software source code to the pre-compiled binaries we install on our systems. If you re interested in contributing to the project, please visit our main website.


This month, Jennifer Helsby launched a new reproduciblewheels.com website to address the lack of reproducibility of Python wheels. To quote Jennifer s accompanying explanatory blog post:
One hiccup we ve encountered in SecureDrop development is that not all Python wheels can be built reproducibly. We ship multiple (Python) projects in Debian packages, with Python dependencies included in those packages as wheels. In order for our Debian packages to be reproducible, we need that wheel build process to also be reproducible
Parallel to this, transparencylog.com was also launched, a service that verifies the contents of URLs against a publicly recorded cryptographic log. It keeps an append-only log of the cryptographic digests of all URLs it has seen. (GitHub repo) On 18th September, Bernhard M. Wiedemann will give a presentation in German, titled Wie reproducible builds Software sicherer machen ( How reproducible builds make software more secure ) at the Internet Security Digital Days 2020 conference.

Reproducible builds at DebConf20 There were a number of talks at the recent online-only DebConf20 conference on the topic of reproducible builds. Holger gave a talk titled Reproducing Bullseye in practice , focusing on independently verifying that the binaries distributed from ftp.debian.org are made from their claimed sources. It also served as a general update on the status of reproducible builds within Debian. The video (145 MB) and slides are available. There were also a number of other talks that involved Reproducible Builds too. For example, the Malayalam language mini-conference had a talk titled , ? ( I want to join Debian, what should I do? ) presented by Praveen Arimbrathodiyil, the Clojure Packaging Team BoF session led by Elana Hashman, as well as Where is Salsa CI right now? that was on the topic of Salsa, the collaborative development server that Debian uses to provide the necessary tools for package maintainers, packaging teams and so on. Jonathan Bustillos (Jathan) also gave a talk in Spanish titled Un camino verificable desde el origen hasta el binario ( A verifiable path from source to binary ). (Video, 88MB)

Development work After many years of development work, the compiler for the Rust programming language now generates reproducible binary code. This generated some general discussion on Reddit on the topic of reproducibility in general. Paul Spooren posted a request for comments to OpenWrt s openwrt-devel mailing list asking for clarification on when to raise the PKG_RELEASE identifier of a package. This is needed in order to successfully perform rebuilds in a reproducible builds context. In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update. Chris Lamb provided some comments and pointers on an upstream issue regarding the reproducibility of a Snap / SquashFS archive file. [ ]

Debian Holger Levsen identified that a large number of Debian .buildinfo build certificates have been tainted on the official Debian build servers, as these environments have files underneath the /usr/local/sbin directory [ ]. He also filed against bug for debrebuild after spotting that it can fail to download packages from snapshot.debian.org [ ]. This month, several issues were uncovered (or assisted) due to the efforts of reproducible builds. For instance, Debian bug #968710 was filed by Simon McVittie, which describes a problem with detached debug symbol files (required to generate a traceback) that is unlikely to have been discovered without reproducible builds. In addition, Jelmer Vernooij called attention that the new Debian Janitor tool is using the property of reproducibility (as well as diffoscope when applying archive-wide changes to Debian:
New merge proposals also include a link to the diffoscope diff between a vanilla build and the build with changes. Unfortunately these can be a bit noisy for packages that are not reproducible yet, due to the difference in build environment between the two builds. [ ]
56 reviews of Debian packages were added, 38 were updated and 24 were removed this month adding to our knowledge about identified issues. Specifically, Chris Lamb added and categorised the nondeterministic_version_generated_by_python_param and the lessc_nondeterministic_keys toolchain issues. [ ][ ] Holger Levsen sponsored Lukas Puehringer s upload of the python-securesystemslib pacage, which is a dependency of in-toto, a framework to secure the integrity of software supply chains. [ ] Lastly, Chris Lamb further refined his merge request against the debian-installer component to allow all arguments from sources.list files (such as [check-valid-until=no]) in order that we can test the reproducibility of the installer images on the Reproducible Builds own testing infrastructure and sent a ping to the team that maintains that code.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of these patches, including:

diffoscope diffoscope is our in-depth and content-aware diff utility that can not only locate and diagnose reproducibility issues, it provides human-readable diffs of all kinds. In August, Chris Lamb made the following changes to diffoscope, including preparing and uploading versions 155, 156, 157 and 158 to Debian:
  • New features:
    • Support extracting data of PGP signed data. (#214)
    • Try files named .pgp against pgpdump(1) to determine whether they are Pretty Good Privacy (PGP) files. (#211)
    • Support multiple options for all file extension matching. [ ]
  • Bug fixes:
    • Don t raise an exception when we encounter XML files with <!ENTITY> declarations inside the Document Type Definition (DTD), or when a DTD or entity references an external resource. (#212)
    • pgpdump(1) can successfully parse some binary files, so check that the parsed output contains something sensible before accepting it. [ ]
    • Temporarily drop gnumeric from the Debian build-dependencies as it has been removed from the testing distribution. (#968742)
    • Correctly use fallback_recognises to prevent matching .xsb binary XML files.
    • Correct identify signed PGP files as file(1) returns data . (#211)
  • Logging improvements:
    • Emit a message when ppudump version does not match our file header. [ ]
    • Don t use Python s repr(object) output in Calling external command messages. [ ]
    • Include the filename in the not identified by any comparator message. [ ]
  • Codebase improvements:
    • Bump Python requirement from 3.6 to 3.7. Most distributions are either shipping with Python 3.5 or 3.7, so supporting 3.6 is not only somewhat unnecessary but also cumbersome to test locally. [ ]
    • Drop some unused imports [ ], drop an unnecessary dictionary comprehensions [ ] and some unnecessary control flow [ ].
    • Correct typo of output in a comment. [ ]
  • Release process:
    • Move generation of debian/tests/control to an external script. [ ]
    • Add some URLs for the site that will appear on PyPI.org. [ ]
    • Update author and author email in setup.py for PyPI.org and similar. [ ]
  • Testsuite improvements:
    • Update PPU tests for compatibility with Free Pascal versions 3.2.0 or greater. (#968124)
    • Mark that our identification test for .ppu files requires ppudump version 3.2.0 or higher. [ ]
    • Add an assert_diff helper that loads and compares a fixture output. [ ][ ][ ][ ]
  • Misc:
In addition, Mattia Rizzolo documented in setup.py that diffoscope works with Python version 3.8 [ ] and Frazer Clews applied some Pylint suggestions [ ] and removed some deprecated methods [ ].

Website This month, Chris Lamb updated the main Reproducible Builds website and documentation to:
  • Clarify & fix a few entries on the who page [ ][ ] and ensure that images do not get to large on some viewports [ ].
  • Clarify use of a pronoun re. Conservancy. [ ]
  • Use View all our monthly reports over View all monthly reports . [ ]
  • Move a is a suffix out of the link target on the SOURCE_DATE_EPOCH age. [ ]
In addition, Javier Jard n added the freedesktop-sdk project [ ] and Kushal Das added SecureDrop project [ ] to our projects page. Lastly, Michael P hn added internationalisation and translation support with help from Hans-Christoph Steiner [ ].

Testing framework The Reproducible Builds project operate a Jenkins-based testing framework to power tests.reproducible-builds.org. This month, Holger Levsen made the following changes:
  • System health checks:
    • Improve explanation how the status and scores are calculated. [ ][ ]
    • Update and condense view of detected issues. [ ][ ]
    • Query the canonical configuration file to determine whether a job is disabled instead of duplicating/hardcoding this. [ ]
    • Detect several problems when updating the status of reporting-oriented metapackage sets. [ ]
    • Detect when diffoscope is not installable [ ] and failures in DNS resolution [ ].
  • Debian:
    • Update the URL to the Debian security team bug tracker s Git repository. [ ]
    • Reschedule the unstable and bullseye distributions often for the arm64 architecture. [ ]
    • Schedule buster less often for armhf. [ ][ ][ ]
    • Force the build of certain packages in the work-in-progress package rebuilder. [ ][ ]
    • Only update the stretch and buster base build images when necessary. [ ]
  • Other distributions:
    • For F-Droid, trigger jobs by commits, not by a timer. [ ]
    • Disable the Archlinux HTML page generation job as it has never worked. [ ]
    • Disable the alternative OpenWrt rebuilder jobs. [ ]
  • Misc;
Many other changes were made too, including:
  • Chris Lamb:
    • Use <pre> HTML tags when dumping fixed-width debugging data in the self-serve package scheduler. [ ]
  • Mattia Rizzolo:
  • Vagrant Cascadian:
    • Mark that the u-boot Universal Boot Loader should not build architecture independent packages on the arm64 architecture anymore. [ ]
Finally, build node maintenance was performed by Holger Levsen [ ], Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ]

Mailing list On our mailing list this month, Leo Wandersleb sent a message to the list after he was wondering how to expand his WalletScrutiny.com project (which aims to improve the security of Bitcoin wallets) from Android wallets to also monitor Linux wallets as well:
If you think you know how to spread the word about reproducibility in the context of Bitcoin wallets through WalletScrutiny, your contributions are highly welcome on this PR [ ]
Julien Lepiller posted to the list linking to a blog post by Tavis Ormandy titled You don t need reproducible builds. Morten Linderud (foxboron) responded with a clear rebuttal that Tavis was only considering the narrow use-case of proprietary vendors and closed-source software. He additionally noted that the criticism that reproducible builds cannot prevent against backdoors being deliberately introduced into the upstream source ( bugdoors ) are decidedly (and deliberately) outside the scope of reproducible builds to begin with. Chris Lamb included the Reproducible Builds mailing list in a wider discussion regarding a tentative proposal to include .buildinfo files in .deb packages, adding his remarks regarding requiring a custom tool in order to determine whether generated build artifacts are identical in a reproducible context. [ ] Jonathan Bustillos (Jathan) posted a quick email to the list requesting whether there was a list of To do tasks in Reproducible Builds. Lastly, Chris Lamb responded at length to a query regarding the status of reproducible builds for Debian ISO or installation images. He noted that most of the technical work has been performed but there are at least four issues until they can be generally advertised as such . He pointed that the privacy-oriented Tails operation system, which is based directly on Debian, has had reproducible builds for a number of years now. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

8 August 2020

Reproducible Builds: Reproducible Builds in July 2020

Welcome to the July 2020 report from the Reproducible Builds project. In these monthly reports, we round-up the things that we have been up to over the past month. As a brief refresher, the motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced from the original free software source code to the pre-compiled binaries we install on our systems. (If you re interested in contributing to the project, please visit our main website.)

General news At the upcoming DebConf20 conference (now being held online), Holger Levsen will present a talk on Thursday 27th August about Reproducing Bullseye in practice , focusing on independently verifying that the binaries distributed from ftp.debian.org were made from their claimed sources. Tavis Ormandy published a blog post making the provocative claim that You don t need reproducible builds , asserting elsewhere that the many attacks that have been extensively reported in our previous reports are fantasy threat models . A number of rebuttals have been made, including one from long-time contributor Reproducible Builds contributor Bernhard Wiedemann. On our mailing list this month, Debian Developer Graham Inggs posted to our list asking for ideas why the openorienteering-mapper Debian package was failing to build on the Reproducible Builds testing framework. Chris Lamb remarked from the build logs that the package may be missing a build dependency, although Graham then used our own diffoscope tool to show that the resulting package remains unchanged with or without it. Later, Nico Tyni noticed that the build failure may be due to the relationship between the FILE C preprocessor macro and the -ffile-prefix-map GCC flag. An issue in Zephyr, a small-footprint kernel designed for use on resource-constrained systems, around .a library files not being reproducible was closed after it was noticed that a key part of their toolchain was updated that now calls --enable-deterministic-archives by default. Reproducible Builds developer kpcyrd commented on a pull request against the libsodium cryptographic library wrapper for Rust, arguing against the testing of CPU features at compile-time. He noted that:
I ve accidentally shipped broken updates to users in the past because the build system was feature-tested and the final binary assumed the instructions would be present without further runtime checks
David Kleuker also asked a question on our mailing list about using SOURCE_DATE_EPOCH with the install(1) tool from GNU coreutils. When comparing two installed packages he noticed that the filesystem birth times differed between them. Chris Lamb replied, realising that this was actually a consequence of using an outdated version of diffoscope and that a fix was in diffoscope version 146 released in May 2020. Later in July, John Scott posted asking for clarification regarding on the Javascript files on our website to add metadata for LibreJS, the browser extension that blocks non-free Javascript scripts from executing. Chris Lamb investigated the issue and realised that we could drop a number of unused Javascript files [ ][ ][ ] and added unminified versions of Bootstrap and jQuery [ ].

Development work

Website On our website this month, Chris Lamb updated the main Reproducible Builds website and documentation to drop a number of unused Javascript files [ ][ ][ ] and added unminified versions of Bootstrap and jQuery [ ]. He also fixed a number of broken URLs [ ][ ]. Gonzalo Bulnes Guilpain made a large number of grammatical improvements [ ][ ][ ][ ][ ] as well as some misspellings, case and whitespace changes too [ ][ ][ ]. Lastly, Holger Levsen updated the README file [ ], marked the Alpine Linux continuous integration tests as currently disabled [ ] and linked the Arch Linux Reproducible Status page from our projects page [ ].

diffoscope diffoscope is our in-depth and content-aware diff utility that can not only locate and diagnose reproducibility issues, it provides human-readable diffs of all kinds. In July, Chris Lamb made the following changes to diffoscope, including releasing versions 150, 151, 152, 153 & 154:
  • New features:
    • Add support for flash-optimised F2FS filesystems. (#207)
    • Don t require zipnote(1) to determine differences in a .zip file as we can use libarchive. [ ]
    • Allow --profile as a synonym for --profile=-, ie. write profiling data to standard output. [ ]
    • Increase the minimum length of the output of strings(1) to eight characters to avoid unnecessary diff noise. [ ]
    • Drop some legacy argument styles: --exclude-directory-metadata and --no-exclude-directory-metadata have been replaced with --exclude-directory-metadata= yes,no . [ ]
  • Bug fixes:
    • Pass the absolute path when extracting members from SquashFS images as we run the command with working directory in a temporary directory. (#189)
    • Correct adding a comment when we cannot extract a filesystem due to missing libguestfs module. [ ]
    • Don t crash when listing entries in archives if they don t have a listed size such as hardlinks in ISO images. (#188)
  • Output improvements:
    • Strip off the file offset prefix from xxd(1) and show bytes in groups of 4. [ ]
    • Don t emit javap not found in path if it is available in the path but it did not result in an actual difference. [ ]
    • Fix ... not available in path messages when looking for Java decompilers that used the Python class name instead of the command. [ ]
  • Logging improvements:
    • Add a bit more debugging info when launching libguestfs. [ ]
    • Reduce the --debug log noise by truncating the has_some_content messages. [ ]
    • Fix the compare_files log message when the file does not have a literal name. [ ]
  • Codebase improvements:
    • Rewrite and rename exit_if_paths_do_not_exist to not check files multiple times. [ ][ ]
    • Add an add_comment helper method; don t mess with our internal list directly. [ ]
    • Replace some simple usages of str.format with Python f-strings [ ] and make it easier to navigate to the main.py entry point [ ].
    • In the RData comparator, always explicitly return None in the failure case as we return a non-None value in the success one. [ ]
    • Tidy some imports [ ][ ][ ] and don t alias a variable when we do not use it. [ ]
    • Clarify the use of a separate NullChanges quasi-file to represent missing data in the Debian package comparator [ ] and clarify use of a null diff in order to remember an exit code. [ ]
  • Other changes:
    • Profile the launch of libguestfs filesystems. [ ]
    • Clarify and correct our contributing info. [ ][ ][ ][ ][ ][ ]
Jean-Romain Garnier also made the following changes:
  • Allow passing a file with a list of arguments via diffoscope @args.txt. (!62)
  • Improve the output of side-by-side diffs by detecting added lines better. (!64)
  • Remove offsets before instructions in objdump [ ][ ] and remove raw instructions from ELF tests [ ].

Other tools strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. It is used automatically in most Debian package builds. In July, Chris Lamb ensured that we did not install the internal handler documentation generated from Perl POD documents [ ] and fixed a trivial typo [ ]. Marc Herbert added a --verbose-level warning when the Archive::Cpio Perl module is missing. (!6) reprotest is our end-user tool to build same source code twice in widely differing environments and then checks the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes to support diffoscope version 153 which had removed the (deprecated) --exclude-directory-metadata and --no-exclude-directory-metadata command-line arguments, and updated the testing configuration to also test under Python version 3.8 [ ].

Distributions

Debian In June 2020, Timo R hling filed a wishlist bug against the debhelper build tool impacting the reproducibility status of hundreds of packages that use the CMake build system. This month however, Niels Thykier uploaded debhelper version 13.2 that passes the -DCMAKE_SKIP_RPATH=ON and -DBUILD_RPATH_USE_ORIGIN=ON arguments to CMake when using the (currently-experimental) Debhelper compatibility level 14. According to Niels, this change:
should fix some reproducibility issues, but may cause breakage if packages run binaries directly from the build directory.
34 reviews of Debian packages were added, 14 were updated and 20 were removed this month adding to our knowledge about identified issues. Chris Lamb added and categorised the nondeterministic_order_of_debhelper_snippets_added_by_dh_fortran_mod [ ] and gem2deb_install_mkmf_log [ ] toolchain issues. Lastly, Holger Levsen filed two more wishlist bugs against the debrebuild Debian package rebuilder tool [ ][ ].

openSUSE In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update. Bernhard also published the results of performing 12,235 verification builds of packages from openSUSE Leap version 15.2 and, as a result, created three pull requests against the openSUSE Build Result Compare Script [ ][ ][ ].

Other distributions In Arch Linux, there was a mass rebuild of old packages in an attempt to make them reproducible. This was performed because building with a previous release of the pacman package manager caused file ordering and size calculation issues when using the btrfs filesystem. A system was also implemented for Arch Linux packagers to receive notifications if/when their package becomes unreproducible, and packagers now have access to a dashboard where they can all see all their unreproducible packages (more info). Paul Spooren sent two versions of a patch for the OpenWrt embedded distribution for adding a build system revision to the packages manifest so that all external feeds can be rebuilt and verified. [ ][ ]

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of these patches, including: Vagrant Cascadian also reported two issues, the first regarding a regression in u-boot boot loader reproducibility for a particular target [ ] and a non-deterministic segmentation fault in the guile-ssh test suite [ ]. Lastly, Jelle van der Waa filed a bug against the MeiliSearch search API to report that it embeds the current build date.

Testing framework We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, Holger Levsen made the following changes:
  • Debian-related changes:
    • Tweak the rescheduling of various architecture and suite combinations. [ ][ ]
    • Fix links for 404 and not for us icons. (#959363)
    • Further work on a rebuilder prototype, for example correctly processing the sbuild exit code. [ ][ ]
    • Update the sudo configuration file to allow the node health job to work correctly. [ ]
    • Add php-horde packages back to the pkg-php-pear package set for the bullseye distribution. [ ]
    • Update the version of debrebuild. [ ]
  • System health check development:
    • Add checks for broken SSH [ ], logrotate [ ], pbuilder [ ], NetBSD [ ], unkillable processes [ ], unresponsive nodes [ ][ ][ ][ ], proxy connection failures [ ], too many installed kernels [ ], etc.
    • Automatically fix some failed systemd units. [ ]
    • Add notes explaining all the issues that hosts are experiencing [ ] and handle zipped job log files correctly [ ].
    • Separate nodes which have been automatically marked as down [ ] and show status icons for jobs with issues [ ].
  • Misc:
    • Disable all Alpine Linux jobs until they are or Alpine is fixed. [ ]
    • Perform some general upkeep of build nodes hosted by OSUOSL. [ ][ ][ ][ ]
In addition, Mattia Rizzolo updated the init_node script to suggest using sudo instead of explicit logout and logins [ ][ ] and the usual build node maintenance was performed by Holger Levsen [ ][ ][ ][ ][ ][ ], Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

7 August 2020

Reproducible Builds (diffoscope): diffoscope 155 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 155. This version includes the following changes:
[ Chris Lamb ]
* Bump Python requirement from 3.6 to 3.7 - most distributions are either
  shipping3.5 or 3.7, so supporting 3.6 is not somewhat unnecessary and also
  more difficult to test locally.
* Improvements to setup.py:
  - Apply the Black source code reformatter.
  - Add some URLs for the site of PyPI.org.
  - Update "author" and author email.
* Explicitly support Python 3.8.
[ Frazer Clews ]
* Move away from the deprecated logger.warn method logger.warning.
[ Mattia Rizzolo ]
* Document ("classify") on PyPI that this project works with Python 3.8.
You find out more by visiting the project homepage.

6 July 2020

Reproducible Builds: Reproducible Builds in June 2020

Welcome to the June 2020 report from the Reproducible Builds project. In these reports we outline the most important things that we and the rest of the community have been up to over the past month.

What are reproducible builds? One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. But whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes.

News The GitHub Security Lab published a long article on the discovery of a piece of malware designed to backdoor open source projects that used the build process and its resulting artifacts to spread itself. In the course of their analysis and investigation, the GitHub team uncovered 26 open source projects that were backdoored by this malware and were actively serving malicious code. (Full article) Carl Dong from Chaincode Labs uploaded a presentation on Bitcoin Build System Security and reproducible builds to YouTube: The app intended to trace infection chains of Covid-19 in Switzerland published information on how to perform a reproducible build. The Reproducible Builds project has received funding in the past from the Open Technology Fund (OTF) to reach specific technical goals, as well as to enable the project to meet in-person at our summits. The OTF has actually also assisted countless other organisations that promote transparent, civil society as well as those that provide tools to circumvent censorship and repressive surveillance. However, the OTF has now been threatened with closure. (More info) It was noticed that Reproducible Builds was mentioned in the book End-user Computer Security by Mark Fernandes (published by WikiBooks) in the section titled Detection of malware in software. Lastly, reproducible builds and other ideas around software supply chain were mentioned in a recent episode of the Ubuntu Podcast in a wider discussion about the Snap and application stores (at approx 16:00).

Distribution work In the ArchLinux distribution, a goal to remove .doctrees from installed files was created via Arch s TODO list mechanism. These .doctree files are caches generated by the Sphinx documentation generator when developing documentation so that Sphinx does not have to reparse all input files across runs. They should not be packaged, especially as they lead to the package being unreproducible as their pickled format contains unreproducible data. Jelle van der Waa and Eli Schwartz submitted various upstream patches to fix projects that install these by default. Dimitry Andric was able to determine why the reproducibility status of FreeBSD s base.txz depended on the number of CPU cores, attributing it to an optimisation made to the Clang C compiler [ ]. After further detailed discussion on the FreeBSD bug it was possible to get the binaries reproducible again [ ]. For the GNU Guix operating system, Vagrant Cascadian started a thread about collecting reproducibility metrics and Jan janneke Nieuwenhuizen posted that they had further reduced their bootstrap seed to 25% which is intended to reduce the amount of code to be audited to avoid potential compiler backdoors. In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update as well as made the following changes within the distribution itself:

Debian Holger Levsen filed three bugs (#961857, #961858 & #961859) against the reproducible-check tool that reports on the reproducible status of installed packages on a running Debian system. They were subsequently all fixed by Chris Lamb [ ][ ][ ]. Timo R hling filed a wishlist bug against the debhelper build tool impacting the reproducibility status of 100s of packages that use the CMake build system which led to a number of tests and next steps. [ ] Chris Lamb contributed to a conversation regarding the nondeterministic execution of order of Debian maintainer scripts that results in the arbitrary allocation of UNIX group IDs, referencing the Tails operating system s approach this [ ]. Vagrant Cascadian also added to a discussion regarding verification formats for reproducible builds. 47 reviews of Debian packages were added, 37 were updated and 69 were removed this month adding to our knowledge about identified issues. Chris Lamb identified and classified a new uids_gids_in_tarballs_generated_by_cmake_kde_package_app_templates issue [ ] and updated the paths_vary_due_to_usrmerge as deterministic issue, and Vagrant Cascadian updated the cmake_rpath_contains_build_path and gcc_captures_build_path issues. [ ][ ][ ]. Lastly, Debian Developer Bill Allombert started a mailing list thread regarding setting the -fdebug-prefix-map command-line argument via an environment variable and Holger Levsen also filed three bugs against the debrebuild Debian package rebuilder tool (#961861, #961862 & #961864).

Development On our website this month, Arnout Engelen added a link to our Mastodon account [ ] and moved the SOURCE_DATE_EPOCH git log example to another section [ ]. Chris Lamb also limited the number of news posts to avoid showing items from (for example) 2017 [ ]. strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. It is used automatically in most Debian package builds. This month, Mattia Rizzolo bumped the debhelper compatibility level to 13 [ ] and adjusted a related dependency to avoid potential circular dependency [ ].

Upstream work The Reproducible Builds project attempts to fix unreproducible packages and we try to to send all of our patches upstream. This month, we wrote a large number of such patches including: Bernhard M. Wiedemann also filed reports for frr (build fails on single-processor machines), ghc-yesod-static/git-annex (a filesystem ordering issue) and ooRexx (ASLR-related issue).

diffoscope diffoscope is our in-depth diff-on-steroids utility which helps us diagnose reproducibility issues in packages. It does not define reproducibility, but rather provides a helpful and human-readable guidance for packages that are not reproducible, rather than relying essentially-useless binary diffs. This month, Chris Lamb uploaded versions 147, 148 and 149 to Debian and made the following changes:
  • New features:
    • Add output from strings(1) to ELF binaries. (#148)
    • Dump PE32+ executables (such as EFI applications) using objdump(1). (#181)
    • Add support for Zsh shell completion. (#158)
  • Bug fixes:
    • Prevent a traceback when comparing PDF documents that did not contain metadata (ie. a PDF /Info stanza). (#150)
    • Fix compatibility with jsondiff version 1.2.0. (#159)
    • Fix an issue in GnuPG keybox file handling that left filenames in the diff. [ ]
    • Correct detection of JSON files due to missing call to File.recognizes that checks candidates against file(1). [ ]
  • Output improvements:
    • Use the CSS word-break property over manually adding U+200B zero-width spaces as these were making copy-pasting cumbersome. (!53)
    • Downgrade the tlsh warning message to an info level warning. (#29)
  • Logging improvements:
  • Testsuite improvements:
    • Update tests for file(1) version 5.39. (#179)
    • Drop accidentally-duplicated copy of the --diff-mask tests. [ ]
    • Don t mask an existing test. [ ]
  • Codebase improvements:
    • Replace obscure references to WF with Wagner-Fischer for clarity. [ ]
    • Use a semantic AbstractMissingType type instead of remembering to check for both types of missing files. [ ]
    • Add a comment regarding potential security issue in the .changes, .dsc and .buildinfo comparators. [ ]
    • Drop a large number of unused imports. [ ][ ][ ][ ][ ]
    • Make many code sections more Pythonic. [ ][ ][ ][ ]
    • Prevent some variable aliasing issues. [ ][ ][ ]
    • Use some tactical f-strings to tidy up code [ ][ ] and remove explicit u"unicode" strings [ ].
    • Refactor a large number of routines for clarity. [ ][ ][ ][ ]
trydiffoscope is the web-based version of diffoscope. This month, Chris Lamb also corrected the location for the celerybeat scheduler to ensure that the clean/tidy tasks are actually called which had caused an accidental resource exhaustion. (#12) In addition Jean-Romain Garnier made the following changes:
  • Fix the --new-file option when comparing directories by merging DirectoryContainer.compare and Container.compare. (#180)
  • Allow user to mask/filter diff output via --diff-mask=REGEX. (!51)
  • Make child pages open in new window in the --html-dir presenter format. [ ]
  • Improve the diffs in the --html-dir format. [ ][ ]
Lastly, Daniel Fullmer fixed the Coreboot filesystem comparator [ ] and Mattia Rizzolo prevented warnings from the tlsh fuzzy-matching library during tests [ ] and tweaked the build system to remove an unwanted .build directory [ ]. For the GNU Guix distribution Vagrant Cascadian updated the version of diffoscope to version 147 [ ] and later 148 [ ].

Testing framework We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org. Amongst many other tasks, this tracks the status of our reproducibility efforts across many distributions as well as identifies any regressions that have been introduced. This month, Holger Levsen made the following changes:
  • Debian-related changes:
    • Prevent bogus failure emails from rsync2buildinfos.debian.net every night. [ ]
    • Merge a fix from David Bremner s database of .buildinfo files to include a fix regarding comparing source vs. binary package versions. [ ]
    • Only run the Debian package rebuilder job twice per day. [ ]
    • Increase bullseye scheduling. [ ]
  • System health status page:
    • Add a note displaying whether a node needs to be rebooted for a kernel upgrade. [ ]
    • Fix sorting order of failed jobs. [ ]
    • Expand footer to link to the related Jenkins job. [ ]
    • Add archlinux_html_pages, openwrt_rebuilder_today and openwrt_rebuilder_future to known broken jobs. [ ]
    • Add HTML <meta> header to refresh the page every 5 minutes. [ ]
    • Count the number of ignored jobs [ ], ignore permanently known broken jobs [ ] and jobs on known offline nodes [ ].
    • Only consider the known offline status from Git. [ ]
    • Various output improvements. [ ][ ]
  • Tools:
    • Switch URLs for the Grml Live Linux and PureOS package sets. [ ][ ]
    • Don t try to build a disorderfs Debian source package. [ ][ ][ ]
    • Stop building diffoscope as we are moving this to Salsa. [ ][ ]
    • Merge several is diffoscope up-to-date on every platform? test jobs into one [ ] and fail less noisily if the version in Debian cannot be determined [ ].
In addition: Marcus Hoffmann was added as a maintainer of the F-Droid reproducible checking components [ ], Jelle van der Waa updated the is diffoscope up-to-date in every platform check for Arch Linux and diffoscope [ ], Mattia Rizzolo backed up a copy of a remove script run on the Codethink-hosted jump server [ ] and Vagrant Cascadian temporarily disabled the fixfilepath on bullseye, to get better data about the ftbfs_due_to_f-file-prefix-map categorised issue. Lastly, the usual build node maintenance was performed by Holger Levsen [ ][ ], Mattia Rizzolo [ ] and Vagrant Cascadian [ ][ ][ ][ ][ ].

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month s report was written by Bernhard M. Wiedemann, Chris Lamb, Eli Schwartz, Holger Levsen, Jelle van der Waa and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

26 June 2020

Reproducible Builds (diffoscope): diffoscope 149 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 149. This version includes the following changes:
[ Chris Lamb ]
* Update tests for file 5.39. (Closes: reproducible-builds/diffoscope#179)
* Downgrade the tlsh warning message to an "info" level warning.
  (Closes: #888237, reproducible-builds/diffoscope#29)
* Use the CSS "word-break" property over manually adding U+200B zero-width
  spaces that make copy-pasting cumbersome.
  (Closes: reproducible-builds/diffoscope!53)
* Codebase improvements:
  - Drop some unused imports from the previous commit.
  - Prevent an unnecessary .format() when rendering difference comments.
  - Use a semantic "AbstractMissingType" type instead of remembering to check
    for both "missing" files and missing containers.
[ Jean-Romain Garnier ]
* Allow user to mask/filter reader output via --diff-mask=REGEX.
  (MR: reproducible-builds/diffoscope!51)
* Make --html-dir child pages open in new window to accommodate new web
  browser content security policies.
* Fix the --new-file option when comparing directories by merging
  DirectoryContainer.compare and Container.compare.
  (Closes: reproducible-builds/diffoscope#180)
* Fix zsh completion for --max-page-diff-block-lines.
[ Mattia Rizzolo ]
* Do not warn about missing tlsh during tests.
You find out more by visiting the project homepage.

4 June 2020

Reproducible Builds: Reproducible Builds in May 2020

Welcome to the May 2020 report from the Reproducible Builds project. One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. Nonetheless, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes. In these reports we outline the most important things that we and the rest of the community have been up to over the past month.

News The Corona-Warn app that helps trace infection chains of SARS-CoV-2/COVID-19 in Germany had a feature request filed against it that it build reproducibly. A number of academics from Cornell University have published a paper titled Backstabber s Knife Collection which reviews various open source software supply chain attacks:
Recent years saw a number of supply chain attacks that leverage the increasing use of open source during software development, which is facilitated by dependency managers that automatically resolve, download and install hundreds of open source packages throughout the software life cycle.
In related news, the LineageOS Android distribution announced that a hacker had access to the infrastructure of their servers after exploiting an unpatched vulnerability. Marcin Jachymiak of the Sia decentralised cloud storage platform posted on their blog that their siac and siad utilities can now be built reproducibly:
This means that anyone can recreate the same binaries produced from our official release process. Now anyone can verify that the release binaries were created using the source code we say they were created from. No single person or computer needs to be trusted when producing the binaries now, which greatly reduces the attack surface for Sia users.
Synchronicity is a distributed build system for Rust build artifacts which have been published to crates.io. The goal of Synchronicity is to provide a distributed binary transparency system which is independent of any central operator. The Comparison of Linux distributions article on Wikipedia now features a Reproducible Builds column indicating whether distributions approach and progress towards achieving reproducible builds.

Distribution work In Debian this month: In Alpine Linux, an issue was filed and closed regarding the reproducibility of .apk packages. Allan McRae of the ArchLinux project posted their third Reproducible builds progress report to the arch-dev-public mailing list which includes the following call for help:
We also need help to investigate and fix the packages that fail to reproduce that we have not investigated as of yet.
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

Software development

diffoscope Chris Lamb made the changes listed below to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. He also prepared and uploaded versions 142, 143, 144, 145 and 146 to Debian, PyPI, etc.
  • Comparison improvements:
    • Improve fuzzy matching of JSON files as file now supports recognising JSON data. (#106)
    • Refactor .changes and .buildinfo handling to show all details (including the GnuPG header and footer components) even when referenced files are not present. (#122)
    • Use our BuildinfoFile comparator (etc.) regardless of whether the associated files (such as the orig.tar.gz and the .deb) are present. [ ]
    • Include GnuPG signature data when comparing .buildinfo, .changes, etc. [ ]
    • Add support for printing Android APK signatures via apksigner(1). (#121)
    • Identify iOS App Zip archive data as .zip files. (#116)
    • Add support for Apple Xcode .mobilepovision files. (#113)
  • Bug fixes:
    • Don t print a traceback if we pass a single, missing argument to diffoscope (eg. a JSON diff to re-load). [ ]
    • Correct differences typo in the ApkFile handler. (#127)
  • Output improvements:
    • Never emit the same id="foo" anchor reference twice in the HTML output, otherwise identically-named parts will not be able to linked to via a #foo anchor. (#120)
    • Never emit an empty id anchor either; it is not possible to link to #. [ ]
    • Don t pretty-print the output when using the --json presenter; it will usually be too complicated to be readable by the human anyway. [ ]
    • Use the SHA256 over MD5 hash when generating page names for the HTML directory-style presenter. (#124)
  • Reporting improvements:
    • Clarify the message when we truncate the number of lines to standard error [ ] and reduce the number of maximum lines printed to 25 as usually the error is obvious by then [ ].
    • Print the amount of free space that we have available in our temporary directory as a debugging message. [ ]
    • Clarify Command [ ] failed with exit code messages to remove duplicate exited with exit but also to note that diffoscope is interpreting this as an error. [ ]
    • Don t leak the full path of the temporary directory in Command [ ] exited with 1 messages. (#126)
    • Clarify the warning message when we cannot import the debian Python module. [ ]
    • Don t repeat stderr from if both commands emit the same output. [ ]
    • Clarify that an external command emits for both files, otherwise it can look like we are repeating itself when, in reality, it is being run twice. [ ]
  • Testsuite improvements:
    • Prevent apksigner test failures due to lack of binfmt_misc, eg. on Salsa CI and elsewhere. [ ]
    • Drop .travis.yml as we use Salsa instead. [ ]
  • Dockerfile improvements:
    • Add a .dockerignore file to whitelist files we actually need in our container. (#105)
    • Use ARG instead of ENV when setting up the DEBIAN_FRONTEND environment variable at runtime. (#103)
    • Run as a non-root user in container. (#102)
    • Install/remove the build-essential during build so we can install the recommended packages from Git. [ ]
  • Codebase improvements:
    • Bump the officially required version of Python from 3.5 to 3.6. (#117)
    • Drop the (default) shell=False keyword argument to subprocess.Popen so that the potentially-unsafe shell=True is more obvious. [ ]
    • Perform string normalisation in Black [ ] and include the Black output in the assertion failure too [ ].
    • Inline MissingFile s special handling of deb822 to prevent leaking through abstract layers. [ ][ ]
    • Allow a bare try/except block when cleaning up temporary files with respect to the flake8 quality assurance tool. [ ]
    • Rename in_dsc_path to dsc_in_same_dir to clarify the use of this variable. [ ]
    • Abstract out the duplicated parts of the debian_fallback class [ ] and add descriptions for the file types. [ ]
    • Various commenting and internal documentation improvements. [ ][ ]
    • Rename the Openssl command class to OpenSSLPKCS7 to accommodate other command names with this prefix. [ ]
  • Misc:
    • Rename the --debugger command-line argument to --pdb. [ ]
    • Normalise filesystem stat(2) birth times (ie. st_birthtime) in the same way we do with the stat(1) command s Access: and Change: times to fix a nondeterministic build failure in GNU Guix. (#74)
    • Ignore case when ordering our file format descriptions. [ ]
    • Drop, add and tidy various module imports. [ ][ ][ ][ ]
In addition:
  • Jean-Romain Garnier fixed a general issue where, for example, LibarchiveMember s has_same_content method was called regardless of the underlying type of file. [ ]
  • Daniel Fullmer fixed an issue where some filesystems could only be mounted read-only. (!49)
  • Emanuel Bronshtein provided a patch to prevent a build of the Docker image containing parts of the build s. (#123)
  • Mattia Rizzolo added an entry to debian/py3dist-overrides to ensure the rpm-python module is used in package dependencies (#89) and moved to using the new execute_after_* and execute_before_* Debhelper rules [ ].

Chris Lamb also performed a huge overhaul of diffoscope s website:
  • Add a completely new design. [ ][ ]
  • Dynamically generate our contributor list [ ] and supported file formats [ ] from the main Git repository.
  • Add a separate, canonical page for every new release. [ ][ ][ ]
  • Generate a latest release section and display that with the corresponding date on the homepage. [ ]
  • Add an RSS feed of our releases [ ][ ][ ][ ][ ] and add to Planet Debian [ ].
  • Use Jekyll s absolute_url and relative_url where possible [ ][ ] and move a number of configuration variables to _config.yml [ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Other tools Elsewhere in our tooling: strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. In May, Chris Lamb uploaded version 1.8.1-1 to Debian unstable and Bernhard M. Wiedemann fixed an off-by-one error when parsing PNG image modification times. (#16) In disorderfs, our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues, Chris Lamb replaced the term dirents in place of directory entries in human-readable output/log messages [ ] and used the astyle source code formatter with the default settings to the main disorderfs.cpp source file [ ]. Holger Levsen bumped the debhelper-compat level to 13 in disorderfs [ ] and reprotest [ ], and for the GNU Guix distribution Vagrant Cascadian updated the versions of disorderfs to version 0.5.10 [ ] and diffoscope to version 145 [ ].

Project documentation & website
  • Carl Dong:
  • Chris Lamb:
    • Rename the Who page to Projects . [ ]
    • Ensure that Jekyll enters the _docs subdirectory to find the _docs/index.md file after an internal move. (#27)
    • Wrap ltmain.sh etc. in preformatted quotes. [ ]
    • Wrap the SOURCE_DATE_EPOCH Python examples onto more lines to prevent visual overflow on the page. [ ]
    • Correct a preferred spelling error. [ ]
  • Holger Levsen:
    • Sort our Academic publications page by publication year [ ] and add Trusting Trust and Fully Countering Trusting Trust through Diverse Double-Compiling [ ].
  • Juri Dispan:

Testing framework We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced. Holger Levsen made the following changes:
  • System health status:
    • Improve page description. [ ]
    • Add more weight to proxy failures. [ ]
    • More verbose debug/failure messages. [ ][ ][ ]
    • Work around strangeness in the Bash shell let VARIABLE=0 exits with an error. [ ]
  • Debian:
    • Fail loudly if there are more than three .buildinfo files with the same name. [ ]
    • Fix a typo which prevented /usr merge variation on Debian unstable. [ ]
    • Temporarily ignore PHP s horde](https://www.horde.org/) packages in Debian bullseye. [ ]
    • Document how to reboot all nodes in parallel, working around molly-guard. [ ]
  • Further work on a Debian package rebuilder:
    • Workaround and document various issues in the debrebuild script. [ ][ ][ ][ ]
    • Improve output in the case of errors. [ ][ ][ ][ ]
    • Improve documentation and future goals [ ][ ][ ][ ], in particular documentiing two real world tests case for an impossible to recreate build environment [ ].
    • Find the right source package to rebuild. [ ]
    • Increase the frequency we run the script. [ ][ ][ ][ ]
    • Improve downloading and selection of the sources to build. [ ][ ][ ]
    • Improve version string handling.. [ ]
    • Handle build failures better. [ ]. [ ]. [ ]
    • Also consider architecture all .buildinfo files. [ ][ ]
In addition:
  • kpcyrd, for Alpine Linux, updated the alpine_schroot.sh script now that a patch for abuild had been released upstream. [ ]
  • Alexander Couzens of the OpenWrt project renamed the brcm47xx target to bcm47xx. [ ]
  • Mattia Rizzolo fixed the printing of the build environment during the second build [ ][ ][ ] and made a number of improvements to the script that deploys Jenkins across our infrastructure [ ][ ][ ].
Lastly, Vagrant Cascadian clarified in the documentation that you need to be user jenkins to run the blacklist command [ ] and the usual build node maintenance was performed was performed by Holger Levsen [ ][ ][ ], Mattia Rizzolo [ ][ ] and Vagrant Cascadian [ ][ ][ ].

Mailing list: There were a number of discussions on our mailing list this month: Paul Spooren started a thread titled Reproducible Builds Verification Format which reopens the discussion around a schema for sharing the results from distributed rebuilders:
To make the results accessible, storable and create tools around them, they should all follow the same schema, a reproducible builds verification format. The format tries to be as generic as possible to cover all open source projects offering precompiled source code. It stores the rebuilder results of what is reproducible and what not.
Hans-Christoph Steiner of the Guardian Project also continued his previous discussion regarding making our website translatable. Lastly, Leo Wandersleb posted a detailed request for feedback on a question of supply chain security and other issues of software review; Leo is the founder of the Wallet Scrutiny project which aims to prove the security of Android Bitcoin Wallets:
Do you own your Bitcoins or do you trust that your app allows you to use your coins while they are actually controlled by them ? Do you have a backup? Do they have a copy they didn t tell you about? Did anybody check the wallet for deliberate backdoors or vulnerabilities? Could anybody check the wallet for those?
Elsewhere, Leo had posted instructions on his attempts to reproduce the binaries for the BlueWallet Bitcoin wallet for iOS and Android platforms.


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month s report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Jelle van der Waa and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

30 May 2020

Reproducible Builds (diffoscope): diffoscope 146 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 146. This version includes the following changes:
[ Chris Lamb ]
* Refactor .changes and .buildinfo handling to show all details (including
  the GPG header and footer components), even when referenced files are not
  present. (Closes: reproducible-builds/diffoscope#122)
* Normalise filesystem stat(2) "birth times" (ie. st_birthtime) in the same
  way we do with stat(1)'s "Access:" and "Change:" times to fix a
  nondetermistic build failure on GNU Guix.
  (Closes: reproducible-builds/diffoscope#74)
* Drop the (default) subprocess.Popen(shell=False) keyword argument so that
  the more unsafe shell=True is more obvious.
* Ignore lower vs. upper-case when ordering our file format descriptions.
* Don't skip string normalisation in Black.
[ Mattia Rizzolo ]
* Add a "py3dist" override for the rpm-python module (Closes: #949598)
* Bump the debhelper compat level to 13 and use the new
  execute_after_*/execture_before_* style rules.
* Fix a spelling error in changelog.
[ Daniel Fullmer ]
* Mount GuestFS filesystem images readonly.
[ Jean-Romain Garnier ]
* Prevent an issue where (for example) LibarchiveMember's has_same_content
  method is called regardless of the actual type of file.
You find out more by visiting the project homepage.

6 May 2020

Reproducible Builds: Reproducible Builds in April 2020

Welcome to the April 2020 report from the Reproducible Builds project. In our regular reports we outline the most important things that we and the rest of the community have been up to over the past month. What are reproducible builds? One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. But whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes.

News It was discovered that more than 725 malicious packages were downloaded thousands of times from RubyGems, the official channel for distributing code for the Ruby programming language. Attackers used a variation of typosquatting and replaced hyphens and underscores (for example, uploading a malevolent atlas-client in place of atlas_client) that executed a script that intercepted Bitcoin payments. (Ars Technica report) Bernhard M. Wiedemann launched ismypackagereproducibleyet.org, a service that takes a package name as input and displays whether the package is reproducible in a number of distributions. For example, it can quickly show the status of Perl as being reproducible on openSUSE but not in Debian. Bernhard also improved the documentation of his unreproducible package to add some example patches for hash issues. [ ]. There was a post on Chaos Computer Club s website listing Ten requirements for the evaluation of Contact Tracing apps in relation to the SARS-CoV-2 epidemic. In particular:
4. Transparency and verifiability: The complete source code for the app and infrastructure must be freely available without access restrictions to allow audits by all interested parties. Reproducible build techniques must be used to ensure that users can verify that the app they download has been built from the audited source code.
Elsewhere, Nicolas Boulenguez wrote a patch for the Ada programming language component of the GCC compiler to skip -f.*-prefix-map options when writing Ada Library Information files. Amongst other properties, these .ali files embed the compiler flags used at the time of the build which results in the absolute build path being recorded via -ffile-prefix-map, -fdebug-prefix-map, etc. In the Arch Linux project, kpcyrd reported that they held their first rebuilder workshop . The session was held on IRC and participants were provided a document with instructions on how to install and use Arch s repro tool. The meeting resulted in multiple people with no prior experience of Reproducible Builds validate their first package. Later in the month he also announced that it was now possible to run independent rebuilders under Arch in a hands-off, everything just works solution to distributed package verification. Mathias Lang submitted a pull request against dmd, the canonical compiler for the D programming languageto add support for our SOURCE_DATE_EPOCH environment variable as well the other C preprocessor tokens such __DATE__, __TIME__ and __TIMESTAMP__ which was subsequently merged. SOURCE_DATE_EPOCH defines a distribution-agnostic standard for build toolchains to consume and emit timestamps in situations where they are deemed to be necessary. [ ] The Telegram instant-messaging platform announced that they had updated to version 5.1.1 continuing their claim that they are reproducible according to their full instructions and therefore verifying that its original source code is exactly the same code that is used to build the versions available on the Apple App Store and Google Play distribution platforms respectfully. Lastly, Herv Boutemy reported that 97% of the current development versions of various Maven packages appear to have a reproducible build. [ ]

Distribution work In Debian this month, 89 reviews of Debian packages were added, 21 were updated and 33 were removed this month adding to our knowledge about identified issues. Many issue types were noticed, categorised and updated by Chris Lamb, including: In addition, Holger Levsen filed a feature request against debrebuild, a tool for rebuilding a Debian package given a .buildinfo file, proposing to add --standalone or --one-shot-mode functionality.
In openSUSE, Bernhard M. Wiedemann made the following changes: In Arch Linux, a rebuilder instance has been setup at reproducible.archlinux.org that is rebuilding Arch s [core] repository directly. The first rebuild has led to approximately 90% packages reproducible contrasting with 94% on the Reproducible Build s project own ArchLinux status page on tests.reproducible-builds.org that continiously builds packages and does not verify Arch Linux packages. More information may be found on the corresponding wiki page and the underlying decisions were explained on our mailing list.

Software development

diffoscope Chris Lamb made the following changes to diffoscope, the Reproducible Builds project s in-depth and content-aware diff utility that can locate and diagnose reproducibility issues (including preparing and uploading versions 139, 140, 141, 142 and 143 to Debian which were subsequently uploaded to the backports repository):
  • Comparison improvements:
    • Dalvik .dex files can also serve as APK containers so restrict the narrower identification of .dex files to files ending with this extension and widen the identification of APK files to when file(1) discovers a Dalvik file. (#28)
    • Add support for Hierarchical Data Format (HD5) files. (#95)
    • Add support for .p7c and .p7b certificates. (#94)
    • Strip paths from the output of zipinfo(1) warnings. (#97)
    • Don t uselessly include the JSON similarity percentage if it is 0.0% . [ ]
    • Render multi-line difference comments in a way to show indentation. (#101)
  • Testsuite improvements:
    • Add pdftotext as a requirement to run the PDF test_metadata text. (#99)
    • apktool 2.5.0 changed the handling of output of XML schemas so update and restrict the corresponding test to match. (#96)
    • Explicitly list python3-h5py in debian/tests/control.in to ensure that we have this module installed during a test run to generate the fixtures in these tests. [ ]
    • Correct parsing of ./setup.py test --pytest-args arguments. [ ]
  • Misc:
    • Capitalise Ordering differences only in text comparison comments. [ ]
    • Improve documentation of FILE_TYPE_HEADER_PREFIX and FALLBACK_FILE_TYPE_HEADER_PREFIX to highlight that only the first 16 bytes are used. [ ]
Michael Osipov created a well-researched merge request to return diffoscope to using zipinfo directly instead of piping input via /dev/stdin in order to ensure portability to the BSD operating system [ ]. In addition, Ben Hutchings documented how --exclude arguments are matched against filenames [ ] and Jelle van der Waa updated the LLVM test fixture difference for LLVM version 10 [ ] as well as adding a reference to the name of the h5dump tool in Arch Linux [ ]. Lastly, Mattia Rizzolo also fixed in incorrect build dependency [ ] and Vagrant Cascadian enabled diffoscope to locate the openssl and h5dump packages on GNU Guix [ ][ ], and updated diffoscope in GNU Guix to version 141 [ ] and 143 [ ].

strip-nondeterminism strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. In April, Chris Lamb made the following changes:
  • Add deprecation plans to all handlers documenting how or if they could be disabled and eventually removed, etc. (#3)
  • Normalise *.sym files as Java archives. (#15)
  • Add support for custom .zip filename filtering and exclude two patterns of files generated by Maven projects in fork mode. (#13)

disorderfs disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues. This month, Chris Lamb fixed a long-standing issue by not drop UNIX groups in FUSE multi-user mode when we are not root (#1) and uploaded version 0.5.9-1 to Debian unstable. Vagrant Cascadian subsequently refreshed disorderfs in GNU Guix to version 0.5.9 [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: In addition, Bernhard informed the following projects that their packages are not reproducible:
  • acoular (report unknown non-determinism)
  • cri-o (report a date issue)
  • gnutls (report certtool being unable to extend certificates beyond 2049)
  • gnutls (report copyright year variation)
  • libxslt (report a bug about non-deterministic output from data corruption)
  • python-astropy (report a future build failure in 2021)

Project documentation This month, Chris Lamb made a large number of changes to our website and documentation in the following categories:
  • Community engagement improvements:
    • Update instructions to register for Salsa on our Contribute page now that the signup process has been overhauled. [ ]
    • Make it clearer that joining the rb-general mailing list is probably a first step for contributors to take. [ ]
    • Make our full contact information easier to find in the footer (#19) and improve text layout using bullets to separate sections [ ].
  • Accessibility:
    • To improve accessibility, make all links underlined. (#12)
    • Use an enhanced foreground/background contrast ratio of 7.04:1. (#11)
  • General improvements:
  • Internals:
    • Move to using jekyll-redirect-from over manual redirect pages [ ][ ] and add a redirect from /docs/buildinfo/ to /docs/recording/. (#23)
    • Limit the website self-check to not scan generated files [ ] and remove the old layout checker now that I have migrated all them [ ].
    • Move the news archive under the /news/ namespace [ ] and improve formatting of archived news links [ ].
    • Various improvements to the draft template generation. [ ][ ][ ][ ]
In addition, Holger Levsen clarified exactly which month we ceased to do weekly reports [ ] and Mattia Rizzolo adjusted the title style of an event page [ ]. Marcus Hoffman also started a discussion on our website s issue tracker asking for clarification on embedded signatures and Chris Lamb subsequently replied and asked Marcus to go ahead and propose a concrete change.

Testing framework We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced.
  • Chris Lamb:
    • Print the build environment prior to executing a build. [ ]
    • Drop a misleading disorderfs-debug prefix in log output when we change non-disorderfs things in the file and, as it happens, do not run disorderfs at all. [ ]
    • The CSS for the package report pages added a margin to all <a> HTML elements under <li> ones, which was causing a comma/bullet spacing issue. [ ]
    • Tidy the copy in the project links sidebar. [ ]
  • Holger Levsen:
    • General:
    • Debian:
      • Reduce scheduling frequency of the buster distribution on the arm64 architecture, etc.. [ ][ ]
      • Show builds per day on a per-architecture basis for the last year on the Debian dashboard. [ ]
      • Drop the Subgraph OS package set as development halted in 2017 or 2018. [ ]
      • Update debrebuild to version from the latest version of devscripts. [ ][ ]
      • Add or improve various parts of the documentation. [ ][ ][ ]
    • Work on a Debian rebuilder:
      • Integrate sbuild. [ ][ ][ ][ ][ ]
      • Select a random .buildinfo file and attempt to build and compare the result. [ ][ ][ ][ ]
      • Improve output and related output formatting. [ ][ ][ ][ ][ ]
      • Outline next steps for the development of the tool. [ ][ ][ ]
      • Various refactoring and code improvements. [ ][ ][ ]
Lastly, Mattia Rizzolo fixed some log parsing code regarding potentially-harmless warnings from package installation [ ][ ] and the usual build node maintenance was performed by Holger Levsen [ ][ ][ ] and Mattia Rizzolo [ ][ ][ ].

Misc news On our mailing list this month, Santiago Torres asked whether we were still publishing releases of our tools to our website and Chris Lamb replied that this was not the case and fixed the issue. Later in the month Santiago also reported that the signature for the disorderfs package did not pass its GPG verification which was also fixed by Chris Lamb. Hans-Christoph Steiner of the Guardian Project asked whether there would be interest in making our website translatable which resulted in a WIP merge request being filed against the website and a discussion on how to track translation updates.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month s report was written by Bernhard M. Wiedemann, Chris Lamb, Daniel Shahaf, Holger Levsen, Jelle van der Waa, kpcyrd, Mattia Rizzolo and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

7 April 2020

Reproducible Builds: Reproducible Builds in March 2020

Welcome to the March 2020 report from the Reproducible Builds project. In our reports we outline the most important things that we have been up to over the past month and some plans for the future.
What are reproducible builds? One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes.

News The report from our recent summit in Marrakesh was published and is now available in both PDF and HTML formats. A sincere thank you to all of the Reproducible Builds community for the input to the event a sincere thank you to Aspiration for preparing and collating this report. Harmut Schorrig published a detailed document on how to compile Java applications in such as way that the .jar build artefact is reproducible across builds. A practical and hands-on guide, it details how to avoid unnecessary differences between builds by explicitly declaring an encoding as the default value differs across Linux and MS Windows systems and ensuring that the generated .jar a variant of a .zip archive does not embed any nondeterministic filesystem metadata, and so on. Janneke gave a quick presentation on GNU Mes and reproducible builds during the lighting talk session at LibrePlanet 2020. [ ] Vagrant Cascadian presented There and Back Again, Reproducibly! video at SCaLE 18x in Pasadena in California which generated some attention on Twitter. Herv Boutemy mentioned on our mailing list in a thread titled Rebuilding and checking Reproducible Builds from Maven Central repository that since the update of a central build script (the parent POM ) every Apache project using the Maven build system should build reproducibly. A follow-up discussion regarding how to perform such rebuilds was also started on the Apache mailing list. The Telegram instant-messaging platform announced that they had updated their iOS and Android OS applications and claim that they are reproducible according to their full instructions, verifying that its original source code is exactly the same code that is used to build the versions available on the Apple App Store and Google Play distribution platforms respectfully. Herv Boutemy also reported about a new project called reproducible-central which aims to allow anyone to rebuild a component from the Maven Central Repository that is expected to be reproducible and check that the result is as expected. In last month s report we detailed Omar Navarro Leija s work in and around an academic paper titled Reproducible Containers which describes in detail the workings of a user-space container tool called dettrace (PDF). Since then, the PhD student from the University Of Pennsylvania presented on this tool at the ASPLOS 2020 conference in Lausanne, Switzerland. Furthermore, there were contributions to dettrace from the Reproducible Builds community itself. [ ][ ]

Distribution work

openSUSE In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update as well as made the following changes within the distribution itself:

Debian Chris Lamb further refined his merge request for the debian-installer component to allow all arguments from sources.list files (such as [check-valid-until=no] ) in order that we can test the reproducibility of the installer images on the Reproducible Builds own testing infrastructure. (#13) Holger Levsen filed a number of bug reports against the debrebuild tool that attempts to rebuild a Debian package given a .buildinfo file as input, including: 48 reviews of Debian packages were added, 17 were updated and 34 were removed this month adding to our knowledge about identified issues. Many issue types were noticed, categorised and updated by Chris Lamb, including: Finally, Holger opened a bug report against the software running tracker.debian.org, a service for Debian Developers to follow the evolution of packages via web and email interfaces to request that they integrate information from buildinfos.debian.net (#955434) and Chris Lamb kept isdebianreproducibleyet.com up to date. [ ]

Software development

diffoscope Chris Lamb made the following changes to diffoscope, the Reproducible Builds project s in-depth and content-aware diff utility that can locate and diagnose reproducibility issues, including preparing and uploading version 138 to Debian:
  • Improvements:
    • Don t allow errors with R script deserialisation cause the entire operation to fail, for example if an external library cannot be loaded. (#91)
    • Experiment with memoising output from expensive external commands, eg. readelf. (#93)
    • Use dumppdf from the python3-pdfminer if we do not see any other differences from pdftext, etc. (#92)
    • Prevent a traceback when comparing two R .rdx files directly as the get_member method will return a file even if the file is missing. [ ]
  • Reporting:
    • Display the supported file formats into the package long description. (#90)
    • Print a potentially-helpful message if the PyPDF2 module is not installed. [ ]
    • Remove any duplicate comparator descriptions when formatting in the --help output or in the package long description. [ ]
    • Weaken Install the X package to get a better output message to may produce a better output as the former is not actually guaranteed. [ ]
  • Misc:
    • Ensure we only parse the recommended packages from --list-debian-substvars when we want them for debian/tests/control generation. [ ]
    • Add upstream metadata file [ ] and add a Lintian override for upstream-metadata-in-native-source as we are upstream. [ ]
    • Inline the RequiredToolNotFound.get_package method s functionality as it is only used once. [ ]
    • Drop the deprecated py36 = [..] argument in the pyproject.toml file. [ ]
In addition, Vagrant Cascadian updated diffoscope in GNU Guix to version 138 [ ], as well as updating reprotest our end-user tool to build same source code twice in widely differing environments and then checks the binaries produced by each build for any differences to version 0.7.14 [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month we wrote a large number of such patches, including:

Project documentation There was further work performed on our documentation and website this month including Alex Wilson adding a section regarding using Gradle for reproducible builds in JVM projects [ ] and Holger Levsen added the report from our recent summit in Marrakesh [ ][ ]. In addition, Chris Lamb made a number of changes, including correcting the syntax of some CSS class formatting [ ], improved some filed against copy a little better [ ] and corrected a reference to calendar.monthrange Python method in a utility function. [ ]

Testing framework We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org that, amongst many other tasks, tracks the status of our reproducibility efforts as well as identifies any regressions that have been introduced. This month, Chris Lamb reworked the web-based package rescheduling tool to:
  • Require a HTTP POST method in the web-based scheduler as not only should HTTP GET requests be idempotent but this will allow many future improvements in the user interface. [ ][ ][ ]
  • Improve the authentication error message in said rescheduler to suggest that the developer s SSL certificate may have expired. [ ]
In addition, Holger Levsen made the following changes:
  • Add a new ath97 subtarget for the OpenWrt distribution.
  • Revisit ordering of Debian suites; sort the experimental distribution last and reverse the ordering of suites to prioritise the suites in development. [ ][ ][ ]
  • Schedule Debian buster and bullseye a little less in order to allow unstable to catch up on the i386 architecture. [ ][ ]
  • Various cosmetic changes to the web-based scheduler. [ ][ ][ ][ ]
  • Improve wordings in the node health maintenance output. [ ]
Lastly, Vagrant Cascadian updated a link to the (formerly) weekly news to our reports page [ ] and kpcyrd fixed the escaping in an Alpine Linux inline patch [ ]. The usual build nodes maintenance was performed by Holger Levsen [ ][ ], Mattia Rizzolo [ ] and Vagrant Cascadian [ ][ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month s report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

6 March 2020

Reproducible Builds: Reproducible Builds in February 2020

Welcome to the February 2020 report from the Reproducible Builds project. One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes. The motivation behind the reproducible builds effort is to provide the ability to demonstrate these binaries originated from a particular, trusted, source release: if identical results are generated from a given source in all circumstances, reproducible builds provides the means for multiple third-parties to reach a consensus on whether a build was compromised via distributed checksum validation or some other scheme. In this month s report, we cover:

If you are interested in contributing to the project, please visit our Contribute page on our website.

Media coverage & upstream news Omar Navarro Leija, a PhD student at the University Of Pennsylvania, published a paper entitled Reproducible Containers that describes in detail the workings of a new user-space container tool called DetTrace:
All computation that occurs inside a DetTrace container is a pure function of the initial filesystem state of the container. Reproducible containers can be used for a variety of purposes, including replication for fault-tolerance, reproducible software builds and reproducible data analytics. We use DetTrace to achieve, in an automatic fashion, reproducibility for 12,130 Debian package builds, containing over 800 million lines of code, as well as bioinformatics and machine learning workflows.
There was also considerable discussion on our mailing list regarding this research and a presentation based on the paper will occur at the ASPLOS 2020 conference between March 16th 20th in Lausanne, Switzerland. The many virtues of Reproducible Builds were touted as benefits for software compliance in a talk at FOSDEM 2020, debating whether the Careful Inventory of Licensing Bill of Materials Have Impact of FOSS License Compliance which pitted Jeff McAffer and Carol Smith against Bradley Kuhn and Max Sills. (~47 minutes in). Nobuyoshi Nakada updated the canonical implementation of the Ruby programming language a change such that filesystem globs (ie. calls to list the contents of filesystem directories) will henceforth be sorted in ascending order. Without this change, the underlying nondeterministic ordering of the filesystem is exposed to the language which often results in an unreproducible build. Vagrant Cascadian reported on our mailing list regarding a quick reproducible test for the GNU Guix distribution, which resulted in 81.9% of packages registering as reproducible in his installation:
$ guix challenge --verbose --diff=diffoscope ...
2,463 store items were analyzed:
  - 2,016 (81.9%) were identical
  - 37 (1.5%) differed
  - 410 (16.6%) were inconclusive
Jeremiah Orians announced on our mailing list the release of a number of tools related to cross-compilation such as M2-Planet and mescc-tools-seed. This project attemps a full bootstrap of a cross-platform compiler for the C programming language (written in C itself) from hex, the ultimate goal being able to demonstrate fully-bootstrapped compiler from hex to the GCC GNU Compiler Collection. This has many implications in and around Ken Thompson s Trusting Trust attack outlined in Thompson s 1983 Turing Award Lecture. Twitter user @TheYoctoJester posted an executive summary of reproducible builds in the Yocto Project: Finally, Reddit user tofflos posted to the /r/Java subreddit asking about how to achieve reproducible builds with Maven and Chris Lamb noticed that the Linux kernel documentation about reproducible builds of it is available on the kernel.org homepages in an attractive HTML format.

Distribution work

Debian Chris Lamb created a merge request for the core debian-installer package to allow all arguments and options from sources.list files (such as [check-valid-until=no] , etc.) in order that we can test the reproducibility of the installer images on the Reproducible Builds own testing infrastructure. (#13) Thorsten Glaser followed-up to a bug filed against the dpkg-source component that was originally filed in late 2015 that claims that the build tool does not respect permissions when unpacking tarballs if the umask is set to 0002. Matthew Garrett posted to the debian-devel mailing list on the topic of Producing verifiable initramfs images as part of a wider conversation on being able to trust the entire software stack on our computers. 59 reviews of Debian packages were added, 30 were updated and 42 were removed this month adding to our knowledge about identified issues. Many issue types were noticed and categorised by Chris Lamb, including:

openSUSE In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update as well as provided the following patches:

Software development

diffoscope diffoscope is our in-depth and content-aware diff-like utility that can locate and diagnose reproducibility issues. It is run countless times a day on our testing infrastructure and is essential for identifying fixes and causes of nondeterministic behaviour. Chris Lamb made the following changes this month, including uploading version 137 to Debian:
  • The sng image utility appears to return with an exit code of 1 if there are even minor errors in the file. (#950806)
  • Also extract classes2.dex, classes3.dex from .apk files extracted by apktool. (#88)
  • No need to use str.format if we are just returning the string. [ ]
  • Add generalised support for ignoring returncodes [ ] and move special-casing of returncodes in zip to use Command.VALID_RETURNCODES. [ ]

Other tools disorderfs is our FUSE-based filesystem that deliberately introduces non-determinism into directory system calls in order to flush out reproducibility issues. This month, Vagrant Cascadian updated the Vcs-Git to specify the debian packaging branch. [ ] reprotest is our end-user tool to build same source code twice in widely differing environments and then checks the binaries produced by each build for any differences. This month, versions 0.7.13 and 0.7.14 were uploaded to Debian unstable by Holger Levsen after Vagrant Cascadian added support for GNU Guix [ ].

Project documentation & website There was more work performed on our documentation and website this month. Bernhard M. Wiedemann added a Java Gradle Build Tool snippet to the SOURCE_DATE_EPOCH documentation [ ] and normalised various terms to unreproducible [ ]. Chris Lamb added a Meson.build example [ ] and improved the documentation for the CMake [ ] to the SOURCE_DATE_EPOCH documentation, replaced anyone can with anyone may as, well, not everyone has the resources, skills, time or funding to actually do what it refers to [ ] and improved the pre-processing for our report generation [ ][ ][ ][ ] etc. In addition, Holger Levsen updated our news page to improve the list of reports [ ], added an explicit mention of the weekly news time span [ ] and reverted sorting of news entries to have latest on top [ ] and Mattia Rizzolo added Codethink as a non-fiscal sponsor [ ] and lastly Tianon Gravi added a Docker Images link underneath the Debian project on our Projects page [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Vagrant Cascadian submitted patches via the Debian bug tracking system targeting the packages the Civil Infrastructure Platform has identified via the CIP and CIP build depends package sets:

Testing framework We operate a fully-featured and comprehensive Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, the following changes were made by Holger Levsen: In addition, Mattia Rizzolo added an Apache web server redirect for buildinfos.debian.net [ ] and reverted the reshuffling of arm64 architecture builders [ ]. The usual build node maintenance was performed by Holger Levsen, Mattia Rizzolo [ ][ ] and Vagrant Cascadian.

Getting in touch If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

This month s report was written by Bernhard M. Wiedemann, Chris Lamb and Holger Levsen. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

Next.

Previous.